Phoneme Recognition: Neural Networks vs Hidden Markov Models
Abstract
[...] phoneme recognition, which is characterized by two important properties: 1.) Using a 3-layer arrangement of simple computing units, it can represent arbitrary nonlinear decision surfaces. The Time-Delay Neural Network (TDNN) learns these decision surfaces automatically using error back-propagation [1]. 2.) The time-delay arrangement enables the network to discover acoustic-phonetic features and the temporal relationships between them independent of position in time and hence not blurred by temporal shifts in the input. For comparison, several discrete Hidden Markov Models (HMMs) were trained to perform the same task, i.e., the speaker-dependent recognition of the phonemes "B", "D", and "G" extracted [...] We show that the TDNN "invented" well-known acoustic-phonetic [...] to the same concept.
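The shift-invariance property claimed in the abstract can be illustrated with a minimal sketch. This is not the authors' network: the layer sizes, the random weights, the sigmoid units, and the helper names (`time_delay_layer`, `integrate_over_time`, `shift`) are all assumptions chosen for illustration. It shows only that a unit bank whose weights are shared across time steps, followed by summation over time, yields the same score when the input event moves in time.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def time_delay_layer(frames, weights, bias):
    """Apply one bank of time-delay units to a sequence of feature frames.

    frames:  list of T frames, each a list of F floats
    weights: weights[d][f][h] for delay d, input feature f, hidden unit h
    bias:    list of H floats
    Returns T - D + 1 activation vectors. The same weights are reused at
    every time step, so a shifted input produces shifted activations.
    """
    n_delays = len(weights)
    outputs = []
    for t in range(len(frames) - n_delays + 1):
        acts = []
        for h in range(len(bias)):
            total = bias[h]
            for d in range(n_delays):
                for f, value in enumerate(frames[t + d]):
                    total += value * weights[d][f][h]
            acts.append(sigmoid(total))
        outputs.append(acts)
    return outputs

def integrate_over_time(activations):
    """Sum each unit's activation over all time steps (assumed output rule)."""
    return [sum(column) for column in zip(*activations)]

def shift(frames, s):
    """Circularly shift the sequence s frames later; silent frames wrap around."""
    return frames[-s:] + frames[:-s]

# Hypothetical sizes: 20 frames, 4 features, 3 delays, 2 hidden units.
random.seed(0)
T, F, D, H = 20, 4, 3, 2
weights = [[[random.uniform(-1, 1) for _ in range(H)] for _ in range(F)]
           for _ in range(D)]
bias = [random.uniform(-1, 1) for _ in range(H)]

# A short acoustic "event" in frames 5..8, surrounded by silence (zeros).
frames = [[0.0] * F for _ in range(T)]
for t in range(5, 9):
    frames[t] = [random.uniform(0, 1) for _ in range(F)]

score = integrate_over_time(time_delay_layer(frames, weights, bias))
shifted_score = integrate_over_time(
    time_delay_layer(shift(frames, 3), weights, bias))
```

Under these assumptions `score` and `shifted_score` agree up to floating-point rounding, because moving the event three frames later only moves the hidden activations by three frames before they are summed.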
Similar resources
Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers because of its applications in various fields of speech processing. Recent research shows that using deep neural networks (DNNs) in speech recognition systems significantly improves their performance. DNN-based phoneme recognition systems have two phases: training and testing. Mos...
Modelling of Deterministic, Fuzzy and Probabilistic Dynamical Systems
Recurrent neural networks and hidden Markov models have been popular tools for sequence recognition problems such as automatic speech recognition. This work investigates the combination of recurrent neural networks and hidden Markov models into a hybrid architecture. This combination is feasible because of the similarity of the architectural dynamics of the two systems. Initial experiments we...
Selected Papers of the Thirteenth International Conference on Computer and
This paper describes an evaluation of an Inhibition/Enhancement (In/En) network for robust automatic speech recognition (ASR). In distinctive phonetic feature (DPF) based speech recognition using neural networks, an In/En network is needed to discriminate whether the dynamic patterns of the DPF trajectories are convex or concave. The network is used to achieve categorical DPFs movement by enhancing ...
Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks
In a hybrid hidden Markov model/artificial neural network (HMM/ANN) automatic speech recognition (ASR) system, the phoneme class conditional probabilities are estimated by first extracting acoustic features from the speech signal based on prior knowledge, such as speech perception and/or speech production knowledge, and then modeling the acoustic features with an ANN. Recent advances in machine...
Selected Papers of the IEEE International Conference on Computer and Information Technology
This paper presents a distinctive phonetic features (DPFs) based phoneme recognition method by incorporating syllable language models (LMs). The method comprises three stages. The first stage extracts three DPF vectors of 15 dimensions each from local features (LFs) of an input speech signal using three multilayer neural networks (MLNs). The second stage incorporates an Inhibition/Enhancement (...